Gene prediction by multiple syntenic alignment

Authors

  • Said Sadique Adi
  • Carlos Eduardo Ferreira
Abstract

Given the increasing number of available genomic sequences, one now faces the task of identifying their functional parts, such as the protein-coding regions. The gene prediction problem can be addressed in several ways. One of the most promising methods makes use of similarity information between the genomic DNA and previously annotated sequences (proteins, cDNAs and ESTs). Recently, given the large number of newly sequenced genomes, new similarity-based methods have been successfully applied to gene prediction. These so-called comparative-based methods rely on the similarities shared by regions of two evolutionarily related genomic sequences. Despite the number of different gene prediction approaches in the literature, the problem remains challenging. In this paper we present a new comparative-based approach to the gene prediction problem. It is based on a syntenic alignment of three or more genomic sequences. By syntenic alignment we mean an alignment constructed with the knowledge that the sequences involved contain conserved regions interrupted by unconserved ones. We have implemented the proposed algorithm in a computer program and confirm the validity of the approach on a benchmark that includes triples of human, mouse and rat genomic sequences.
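To make the notion of a syntenic alignment concrete, the following is a minimal pairwise sketch, not the multi-sequence algorithm of the paper. It assumes a simple scoring scheme in which an unconserved stretch of any length in either sequence can be skipped as one "difference block" for a fixed penalty. The function name `syntenic_score` and all score values are hypothetical choices made for illustration.

```python
NEG = float("-inf")

def syntenic_score(a, b, match=1, mismatch=-1, gap=-2, block=-2):
    """Score a pairwise alignment that may skip unconserved blocks.

    Illustrative assumption: a difference block covers any number of
    characters of a and/or b and costs a flat `block` penalty once.
    """
    n, m = len(a), len(b)
    # S[i][j]: best score of a[:i] vs b[:j] ending in an ordinary
    #          alignment column (match, mismatch, or gap).
    # D[i][j]: best score of a[:i] vs b[:j] ending inside a difference block.
    S = [[NEG] * (m + 1) for _ in range(n + 1)]
    D = [[NEG] * (m + 1) for _ in range(n + 1)]
    S[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best_s, best_d = NEG, NEG
            if i > 0:
                # Gap column in b, or extend/open a block over a[i-1].
                best_s = max(best_s, max(S[i - 1][j], D[i - 1][j]) + gap)
                best_d = max(best_d, D[i - 1][j], S[i - 1][j] + block)
            if j > 0:
                # Gap column in a, or extend/open a block over b[j-1].
                best_s = max(best_s, max(S[i][j - 1], D[i][j - 1]) + gap)
                best_d = max(best_d, D[i][j - 1], S[i][j - 1] + block)
            if i > 0 and j > 0:
                # Substitution column; this also closes an open block.
                sub = match if a[i - 1] == b[j - 1] else mismatch
                best_s = max(best_s, max(S[i - 1][j - 1], D[i - 1][j - 1]) + sub)
            S[i][j], D[i][j] = best_s, best_d
    return max(S[n][m], D[n][m])

# Two toy sequences: shared flanks with divergent middles of unequal length.
seq_a = "AAAA" + "C" * 8 + "GGGG"
seq_b = "AAAA" + "T" * 12 + "GGGG"
print(syntenic_score(seq_a, seq_b))               # blocks allowed
print(syntenic_score(seq_a, seq_b, block=NEG))    # blocks disabled (plain global)
```

With blocks allowed, the divergent middle is absorbed by a single block and the conserved flanks align cleanly; setting `block` to negative infinity disables blocks and reduces the recurrence to an ordinary global alignment, which scores lower on this toy input.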

Similar resources

Gene Prediction by Syntenic Alignment

Abstract. Given the amount of available genomic DNA, one now faces the task of identifying the functional parts of such raw sequence data, like the protein-coding regions. The gene prediction problem can be addressed in several ways. The most recent methods make use of the similarities between regions of two unannotated genomic sequences in order to find their genes. In this paper we present ...

Genome-Wide Comparative in silico Analysis of Calcium Transporters of Rice and Sorghum

The mechanism of calcium uptake, translocation and accumulation in Poaceae has not yet been fully understood. To address this issue, we conducted genome-wide comparative in silico analysis of the calcium (Ca(2+)) transporter gene family of two crop species, rice and sorghum. Gene annotation, identification of upstream cis-acting elements, phylogenetic tree construction and syntenic mapping of t...

Techniques for multi-genome synteny analysis to overcome assembly limitations.

Genome scale synteny analysis, the analysis of relative gene-order conservation between species, can provide key insights into evolutionary chromosomal dynamics, rearrangement rates between species, and speciation analysis. With the rapid availability of multiple genomes, there is a need for efficient solutions to aid in comparative syntenic analysis. Current methods rely on homology assessment...

Genomic features in the breakpoint regions between syntenic blocks

MOTIVATION We study the largely unaligned regions between the syntenic blocks conserved in humans and mice, based on data extracted from the UCSC genome browser. These regions contain evolutionary breakpoints caused by inversion, translocation and other processes. RESULTS We suggest explanations for the limited amount of genomic alignment in the neighbourhoods of breakpoints. We discount infe...

SLAM: cross-species gene finding and alignment with a generalized pair hidden Markov model.

Comparative-based gene recognition is driven by the principle that conserved regions between related organisms are more likely than divergent regions to be coding. We describe a probabilistic framework for gene structure and alignment that can be used to simultaneously find both the gene structure and alignment of two syntenic genomic regions. A key feature of the method is the ability to enhan...

Journal:
  • J. Integrative Bioinformatics

Volume 2

Publication date: 2005